software artifact
An LLM-as-Judge Metric for Bridging the Gap with Human Evaluation in SE Tasks
Zhou, Xin, Kim, Kisub, Zhang, Ting, Weyssow, Martin, Gomes, Luis F., Yang, Guang, Liu, Kui, Xia, Xin, Lo, David
Large Language Models (LLMs) and other automated techniques have been increasingly used to support software developers by generating software artifacts such as code snippets, patches, and comments. However, accurately assessing the correctness of these generated artifacts remains a significant challenge. On one hand, human evaluation provides high accuracy but is labor-intensive and lacks scalability. On the other hand, many automatic evaluation metrics are scalable and require minimal human effort, but they often fail to accurately reflect the actual correctness of generated software artifacts. In this paper, we present SE-Jury, the first evaluation metric for LLM-as-Ensemble-Judge specifically designed to accurately assess the correctness of generated software artifacts. SE-Jury first defines five distinct evaluation strategies, each implemented by an independent judge. A dynamic team selection mechanism then identifies the most appropriate subset of judges as a team to produce a final correctness score through ensembling. We evaluate SE-Jury across a diverse set of software engineering (SE) benchmarks that span three popular SE tasks: code generation, automated program repair, and code summarization. Results demonstrate that SE-Jury consistently achieves a higher correlation with human judgments, with improvements ranging from 29.6% to 140.8% over existing automatic metrics. SE-Jury reaches agreement levels with human annotators that are close to inter-annotator agreement in code generation and program repair. These findings underscore SE-Jury's potential as a scalable and reliable alternative to human evaluation in these SE tasks.
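The ensemble mechanism the abstract describes can be pictured as a small scoring loop: several independent judges each score an artifact, a selection step keeps a subset of them, and the team's scores are combined. The sketch below is a toy illustration of that structure only; the judge names, the averaging combiner, and the outlier-dropping selection rule are assumptions for illustration, not SE-Jury's actual strategies.

```python
# Toy sketch of an ensemble-judge scoring loop in the spirit of SE-Jury.
# All judges and the team-selection heuristic here are illustrative.
from statistics import mean

def ensemble_judge(artifact, judges, select_team):
    """Score `artifact` with every judge, pick a team, ensemble its scores."""
    scores = {name: judge(artifact) for name, judge in judges.items()}
    team = select_team(scores)                  # dynamic team selection
    return mean(scores[name] for name in team)  # simple averaging ensemble

# Toy judges: each maps an artifact to a correctness score in [0, 1].
judges = {
    "exact_match":   lambda a: 1.0 if a == "ref" else 0.0,
    "length_check":  lambda a: min(len(a) / 3, 1.0),
    "keyword_check": lambda a: 1.0 if "re" in a else 0.0,
}

# Toy selection rule: drop the judge whose score deviates most from the mean.
def drop_outlier(scores):
    avg = mean(scores.values())
    outlier = max(scores, key=lambda n: abs(scores[n] - avg))
    return [n for n in scores if n != outlier]

print(ensemble_judge("ref", judges, drop_outlier))
```

The point of the structure is that no single judge's bias dominates: the selection step can discard a judge that disagrees sharply with the rest before the ensemble averages the remainder.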
Integrating Various Software Artifacts for Better LLM-based Bug Localization and Program Repair
Feng, Qiong, Ma, Xiaotian, Sheng, Jiayi, Feng, Ziyuan, Song, Wei, Liang, Peng
LLMs have garnered considerable attention for their potential to streamline Automated Program Repair (APR). LLM-based approaches can either insert the correct code or directly generate patches when provided with buggy methods. However, most LLM-based APR methods rely on a single type of software information, without fully leveraging different software artifacts. Moreover, many LLM-based approaches do not explore which specific types of information best assist in APR. Addressing this gap is crucial for advancing LLM-based APR techniques. We propose DEVLoRe, which uses issue content (description and message) and stack error traces to localize buggy methods, then draws on debug information from the buggy methods, together with the issue content and stack error traces, to localize buggy lines and generate plausible patches that pass all unit tests. The results show that while issue content is particularly effective in assisting LLMs with fault localization and program repair, different types of software artifacts complement each other. By incorporating different artifacts, DEVLoRe successfully locates 49.3% and 47.6% of single and non-single buggy methods and generates 56.0% and 14.5% plausible patches for the Defects4J v2.0 dataset, respectively. This outperforms current state-of-the-art APR methods. The source code and experimental results of this work for replication are available at https://github.com/XYZboom/DEVLoRe.
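The two-stage pipeline the abstract outlines, combining issue content and stack traces to rank suspicious methods, then feeding the extra context into a repair prompt, can be sketched roughly as below. The scoring heuristic, prompt layout, and example data are assumptions for illustration only, not DEVLoRe's implementation.

```python
# Illustrative sketch of a multi-artifact APR pipeline in the spirit of
# DEVLoRe: rank methods by how many artifacts mention them, then build a
# repair prompt carrying all of that context for an LLM to complete.

def localize(methods, issue_text, stack_trace):
    """Rank methods: those named in the stack trace first, then those
    mentioned in the issue text, as a crude suspiciousness heuristic."""
    def score(name):
        s = 0
        if name in stack_trace:
            s += 2  # stack traces are the stronger localization signal
        if name in issue_text:
            s += 1
        return s
    return sorted(methods, key=score, reverse=True)

def build_repair_prompt(method_src, issue_text, stack_trace):
    """Assemble a patch-generation prompt from all available artifacts."""
    return (
        "Fix the bug in the following method.\n"
        f"Issue report:\n{issue_text}\n"
        f"Stack trace:\n{stack_trace}\n"
        f"Buggy method:\n{method_src}\n"
    )

methods = ["parseDate", "formatDate", "toString"]
issue = "parseDate crashes on empty input"
trace = "NullPointerException at DateUtil.parseDate(DateUtil.java:42)"
ranked = localize(methods, issue, trace)
print(ranked[0])  # parseDate ranks first: it appears in both artifacts
```

This mirrors the paper's finding in miniature: each artifact alone gives a weaker signal, but a method implicated by both the issue text and the stack trace is a much stronger repair candidate.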
- Asia > China > Jiangsu Province > Nanjing (0.05)
- Asia > China > Hubei Province > Wuhan (0.04)
- Asia > Middle East > Republic of Türkiye (0.04)
GitLab Acquires UnReview to Further AI Ambitions - DevOps.com
GitLab announced this week it has acquired UnReview, a provider of a tool that employs machine learning algorithms to identify which expert code reviewers to assign to a project based on both the quality of their previous efforts and their current workloads. David DeSanto, senior director for product management at GitLab, said the acquisition of UnReview is the latest step in an AI strategy that, in addition to optimizing DevOps processes, will eventually unify machine learning operations (MLOps) and DevOps workflows. Accessed via the Dev section of the GitLab platform, UnReview will also be employed to manage the overall code review process. DeSanto said GitLab is committed to employing AI technologies to automate workflows and compress cycle times across all stages of the DevSecOps life cycle. The goal is not to eliminate the need for DevOps teams but rather to eliminate the low-level tasks that conspire to hamper productivity, while at the same time improving application security, noted DeSanto.
On Cycling Risk and Discomfort: Urban Safety Mapping and Bike Route Recommendations
Castells-Graells, David, Salahub, Christopher, Pournaras, Evangelos
Bike usage in Smart Cities is becoming paramount for sustainable urban development. Cycling provides tremendous opportunities for a healthier lifestyle, lower energy consumption and carbon emissions, as well as a reduction in traffic jams. While the number of cyclists increases along with the expansion of bike sharing initiatives and infrastructures, the number of bike accidents rises drastically, threatening to jeopardize the urban bike movement. This paper studies cycling risk and discomfort using a diverse spectrum of data sources about geolocated bike accidents and their severity. Empirical continuous spatial risk estimations are calculated via kernel density contours that map safety in a case study of the city of Zurich. The roles of weather, time, accident type, and severity are illustrated. Given the predominance of self-caused accidents, an open-source software artifact for personalized route recommendations is introduced. The software is also used to collect open baseline route data that are compared with alternative routes that minimize risk or discomfort. These contributions can provide invaluable insights for urban planners to improve infrastructure. They can also improve the risk awareness of existing cyclists as well as support new cyclists, such as tourists, in safely exploring a new urban environment by bike.
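The core technique here, an empirical spatial risk surface built by kernel density estimation over geolocated accidents, can be sketched in a few lines. The coordinates, severity weights, and bandwidth below are synthetic assumptions for illustration; the paper's actual estimator and data differ.

```python
# Minimal sketch of severity-weighted 2D kernel density estimation over
# geolocated accident points, the kind of surface from which risk contours
# can be drawn. Pure-Python Gaussian kernels; all data below is synthetic.
import math

def gaussian_kde_2d(points, weights, bandwidth=0.005):
    """Return a function estimating weighted accident density at (x, y)."""
    total = sum(weights)
    def density(x, y):
        acc = 0.0
        for (px, py), w in zip(points, weights):
            d2 = ((x - px) ** 2 + (y - py) ** 2) / (2 * bandwidth ** 2)
            acc += w * math.exp(-d2)  # Gaussian kernel, severity-weighted
        return acc / (total * 2 * math.pi * bandwidth ** 2)
    return density

# Synthetic accidents near Zurich's centre, weighted by severity (1-3).
accidents = [(8.540, 47.370), (8.541, 47.371), (8.555, 47.360)]
severity = [3, 2, 1]
risk = gaussian_kde_2d(accidents, severity)

# The risk surface peaks near the cluster of severe accidents.
print(risk(8.5405, 47.3705) > risk(8.560, 47.380))  # True
```

Evaluating such a density on a regular grid yields the continuous risk surface from which contour lines (as in the paper's safety maps) can be drawn, and which a route recommender can penalize when scoring candidate paths.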
- North America > United States (0.46)
- Europe > Switzerland > Zürich > Zürich (0.37)
- Europe > Austria > Vienna (0.14)
- Europe > Norway (0.04)
- Energy (0.74)
- Transportation > Infrastructure & Services (0.68)
- Health & Medicine > Consumer Health (0.48)
- Government > Regional Government (0.46)
Building the Universal Archive of Source Code
Software is becoming the fabric that binds our personal and social lives, embodying a vast part of the technological knowledge that powers our industry and fuels innovation. Software is a pillar of most scientific research activities in all fields, from mathematics to physics, from chemistry to biology, from finance to social sciences. Software is also an essential mediator for accessing any digital information. In short, a rapidly increasing part of our collective knowledge is embodied in, or dependent on, software artifacts. Our ability to design, use, understand, adapt, and evolve systems and devices on which our lives have come to depend relies on our ability to understand, adapt, and evolve the source code of the software that controls them.